State Abstraction Discovery from Irrelevant State Variables

Authors

  • Nicholas K. Jong
  • Peter Stone
Abstract

Abstraction is a powerful form of domain knowledge that allows reinforcement-learning agents to cope with complex environments, but in most cases a human must supply this knowledge. In the absence of such prior knowledge or a given model, we propose an algorithm for the automatic discovery of state abstraction from policies learned in one domain for use in other domains that have similar structure. To this end, we introduce a novel condition for state abstraction in terms of the relevance of state features to optimal behavior, and we exhibit statistical methods that detect this condition robustly. Finally, we show how to apply temporal abstraction to benefit safely from even partial state abstraction in the presence of generalization error.


Similar resources

Hydrogen Abstraction Reaction of Hydroxyl Radical with 1,1-Dibromoethane and 1,2-Dibromoethane Studied by Using Semi-Classical Transition State Theory

The hydrogen abstraction reaction by OH radical from CH2BrCH2Br (R1) and CH3CHBr2 (R2) is investigated theoretically by semi-classical transition state theory. The stationary points for both reactions are located by using ωB97X-D and KMLYP density functional methods along with cc-pVTZ basis. Single-point energy calculations are performed at the QCISD(T) and CCSD(T) levels of theory with differe...


Generating Exponentially Smaller POMDP Models Using Conditionally Irrelevant Variable Abstraction

The state of a POMDP can often be factored into a tuple of n state variables. The corresponding flat model, with size exponential in n, may be intractably large. We present a novel method called conditionally irrelevant variable abstraction (CIVA) for losslessly compressing the factored model, which is then expanded into an exponentially smaller flat model in a representation compatible with ma...


Indexed Predicate Discovery for Unbounded System Verification

Predicate abstraction has been proved effective for verifying several infinite-state systems. In predicate abstraction, an abstract system is automatically constructed given a set of predicates. Predicate abstraction coupled with automatic predicate discovery provides for a completely automatic verification scheme. For systems with unbounded integer state variables (e.g. software), counterexamp...


Better Under-Approximation of Programs by Hiding Variables

Abstraction frameworks use under-approximating transitions in order to prove existential properties of concrete systems. Under-approximating transitions refer to the concrete states that correspond to a particular abstract state in a universal manner. For example, there is a must transition from abstract state a to abstract state a′ only if all the concrete states in a have successors in a′. Th...


Learning to Identify Irrelevant State Variables

When they are available, safe state abstractions improve the efficiency of reinforcement learning algorithms by allowing an agent to ignore irrelevant distinctions between states while still learning an optimal policy. Prior work investigated how to incorporate state abstractions into existing algorithms, but most approaches required the user to provide the abstraction. How to discover this kin...



Publication date: 2005